Multilingual Natural Language Processing
Author: Rada Mihalcea
Abstract
With the rapid growth of online resources such as Wikipedia, Twitter, and Facebook, an increasing number of languages have a Web presence, and there is a correspondingly growing need for effective solutions for multilingual natural language processing. In this talk, I will explore the hypothesis that a multilingual representation can enrich the feature space for natural language processing tasks and lead to significant improvements over traditional solutions that rely exclusively on a monolingual representation. Specifically, I will describe experiments performed on three different tasks (word sense disambiguation, subjectivity analysis, and text semantic similarity) and show how a multilingual representation can leverage additional information from the languages in the multilingual space, and thus improve over the use of only one language at a time.
This is joint work with Samer Hassan and Carmen Banea.
Bio
Rada Mihalcea is an Associate Professor in the Department of Computer Science and Engineering at the University of North Texas. Her research interests are in computational linguistics, with a focus on lexical semantics, graph-based algorithms for natural language processing, and multilingual natural language processing. She serves or has served on the editorial boards of the journals Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, and Research on Language and Computation. She was a program co-chair for the Conference of the Association for Computational Linguistics (2011) and the Conference on Empirical Methods in Natural Language Processing (2009). She is the recipient of a National Science Foundation CAREER award (2008) and a Presidential Early Career Award for Scientists and Engineers (2009).
Similar resources
Towards Development of Multilingual Spoken Dialogue Systems
Developing multilingual dialogue systems brings up various challenges. Among them is the development of natural language understanding and generation components, with a focus on creating new language parts as rapidly as possible. Another challenge is to ensure compatibility between the different language-specific components during maintenance and ongoing development of the system. We describe our expe...
A multilingual ontology matcher
State-of-the-art multilingual ontology matchers use machine translation to reduce the problem to the monolingual case. We investigate an alternative, self-contained solution based on semantic matching where labels are parsed by multilingual natural language processing and then matched using a language-independent knowledge base acting as an interlingua. As the method relies on the availability ...
Principled Multilingual Grammars for Large Corpora
We describe a multilingual implementation of such a grammar, and its advantages over both principle-based parsing and ad-hoc grammar design. We show how X-bar theory and language-independent semantic constraints facilitate grammar development. Our implementation includes innovative handling of (1) syntactic gaps, (2) logical structure alternations, and (3) conjunctions. Each of these innovations...
Multilingual Natural Language Generation (Experience from AGILE Project)
Multilingual Natural Language Generation is an interesting and challenging field of Natural Language Processing. Automatic generation of texts in natural language could be viewed as the final part of an automated translation process from one language to another. An alternative approach is made possible by the development of modern Natural Language Processing technologies, which concentrate the researc...